AN ARTIFICIAL INTELLIGENCE TECHNIQUE FOR AUTOMATING
SEISMIC STRATIGRAPHIC INTERPRETATION

Scott  W.  Shaw  and  Rui  J.  P.  deFigueiredo 

Department  of  Electrical  and  Computer  Engineering 

Rice  University 
Houston,  Texas  77251-1892 
(713)  527-8101  ext.  3569 

Abstract 

Studying the character of reflected seismic wavelets may reveal facts
about the stratigraphy of the reflector. Computers can aid humans in this
task by revealing structural similarities between various wavelets. The
relational tree is a good way to represent a waveform's global structure.
By representing a waveform as a relational tree, processing it
symbolically, and clustering the processed trees, a seismic waveform
recognition system can be constructed. The symbolic processing is based on
a tree transformation.

An objective function, which measures the effectiveness of such a
transformation, utilizes the ratio of between-cluster to within-cluster
scatter. The action of a tree transformation applied to tree spaces is
analogous to that of linear discriminants applied to feature spaces. When
tested on simulated seismic data, the relational tree waveform recognition
system performs well at high signal-to-noise ratios.

INTRODUCTION 

Currently available seismic processing techniques, when applied to
properly acquired data, can produce stacked seismic sections which contain
a large amount of stratigraphic information. If a high-bandwidth source
wavelet is used, analyzing the shape of a reflected event may reveal details
about the reflector lithology. In this paper, we show how computers may be
used to automate the analysis of the reflection character. The developments
here are based on the premise that there is significance in the lateral
variation of the reflection character in a seismogram. For this reason we
postulate that computers can aid humans in discovering stratigraphically
significant anomalies. We wish to extract information from individual
waveforms and classify reflections based on waveform morphology. The
technique discussed here is to be applied to sets of isolated reflections.

This is not a global processing technique to be applied to entire traces
over the whole data set, as are the operations commonly associated with
seismic data processing. The system consists of local actions to be
performed on waveforms which have been conventionally processed and then
extracted from a larger data set. Our technique involves symbolic
representations and processing routines of the type associated with
artificial intelligence.

There are data structures which represent signals in such a way that
specific waveform properties are emphasized. These properties may have
to do with a wavelet's overall structural character. Such data structures
are constrained to a finite set by some grammar which describes the
waveforms we expect to encounter. Given these syntactic constraints,
machine processing may be designed which intelligently exploits what is
already known. This represents an algorithmic approach to signal
processing and classification. Waveform processing techniques to date have
relied heavily on signal representations consisting of regularly spaced,
sequential, digital samples. The algorithms and hardware developed so far
have exploited the characteristics of this type of representation using
matrix manipulation and numerical computation. We intend to describe an
alternate structure for signal representation, and introduce processing
techniques that exploit the a priori constraints on this representation.

We examine some of the techniques developed to date and introduce a
waveform recognition system that utilizes tree signal representations,
symbolic processing, and non-parametric clustering techniques. The
system will then be applied to noisy synthetic seismic data so that we may
observe the resulting classification error.

The waveforms in question are represented in such a way that their global
structure is emphasized. This representation, known as the relational
tree, describes the relative placement of peaks and valleys within the
waveform.  The  relational  tree  is  then  manipulated  by  a  tree 
transformation  so  that  critical  information  is  preserved,  and  superfluous 
information  is  stripped  away.  A  tree  cluster  objective  function  is 
introduced  to  measure  the  success  of  the  tree  transform.  Traditional 
cluster  analysis  techniques  are  then  adapted  to  classifying  individual  trees. 

WAVELET  CHARACTER  INTERPRETATION 

Analysis based on visual inspection by experienced humans remains the
most reliable seismic interpretation technique. For this purpose, the
interpreter must assimilate the large amount of information available in
modern, high-resolution seismic data. Much of this information lies not only
in the time of arrival of a reflection event, but also in the character of the
reflected wavelet, a much subtler indication of geology. Under ideal
conditions, the presence of reflectors less than 1/8 wavelength thick can
cause variations in reflection character (Sheriff, 1977). It is not clear,
however, that the human eye is able to detect these variations. A complex
reflection may be made up of the superposition of many wavelets with varying
amplitude, phase, and offset (Dobrin, 1977). Small fluctuations in wavelet
character caused by changing lithology are usually ignored in favor of a
macroscopic view of the subsurface. The following are a few examples in
which the reflection character was used to formulate a complete subsurface
interpretation.

Numerous case studies indicate that small variations in reflector
stratigraphy lead to subtle lateral variation in reflection character.
Hydrocarbon indicators such as bright spots and oil-gas interfaces have
long been known by seismic interpreters, but small changes in waveform
character may indicate more about lithology. Waters and Rice (1975)
showed by a series of synthetic seismograms that lateral variations in
waveform result from facies changes in the Pennsylvanian Morrow
Formation. These effects were then compared to observed reflection data.
In a study over a producing gas field, Focht and Baker (1985) related the
seismic effects of varying porosity and gas accumulation in the Colony
Sandstone Formation of southern Alberta. They illustrated how constructive
interference between various reflectors within the Colony contributes to the
total reflection signature. As the stratigraphy of the Colony varies
horizontally, so does the character of the reflected wavelet. By relating this
character to synthetic seismograms taken from producing wells, Focht and
Baker were able to quantify these effects and predict the lithology at
undrilled locations. They demonstrated that the makeup of the Colony
reflection at a certain point is determined by the amplitude, polarity, and
offset of the various individual interfaces within the Colony sand. Chapman
and Schafers (1983) give evidence that the presence of a shallow sand
channel in the Illinois Basin produces a marked reflection character change.

Though experienced humans are currently the best interpreters of
seismic waveforms, the decisions they make cannot always be quantified
and traced to some underlying physical principle. Often, they are based more
on intuition and past observations than on in-depth understanding of a
physical model. The algorithms associated with artificial intelligence have
been developed to imitate humans, who excel at this type of heuristic
reasoning. Within narrow fields of expertise, computers can sometimes
perform as human experts do. Presumably, humans derive certain cues
from waveform structure. If a machine can recognize waveforms by using
these cues, it could also augment those cues with some of its own,
processing individual reflections to extract information that cannot be
observed through visual inspection.

RELATED  WORK 

Previous attempts have been made to apply pattern recognition techniques
to seismic waveforms. As early as 1969, Mathieu and Rice (1969) extracted
features from seismic wavelets and used multivariate analysis to detect
stratigraphic anomalies. Later, Waters and Rice (1975) applied statistical
cluster analysis in the search for stratigraphic traps. Recently, Sinvhal
et al. (1984) have quantified some predictors of lithology which are
derivable from seismic reflection data. Variables such as autocorrelation
values and reflected frequencies taken at various points were used as
features. Discriminant analysis was then applied to optimize clustering and
reduce dimensionality. An attempt was made to model lithologic sequences as
Markov processes. Additional discussion on the application of pattern
recognition techniques to petroleum exploration data is found in
deFigueiredo (1982).

Little attention has been given to symbolic waveform representations for
seismic interpretation. The notable exception has been Huang and Fu (1986),
who applied syntactic pattern recognition, with error-correcting parsing
and Hough transforms, to classifying bright spots in reflection seismic data.
Both individual waveforms and two-dimensional reflector shapes were
considered. The waveform representation consisted of strings of primitives
which described the slope between successive samples along the wavelet.

Lu and Cheng (1985) used modifications of relational trees along with a
sophisticated tree matching procedure to perform correlations between
well-log waveforms. We shall adopt a similar approach for our work.

Since human seismic interpreters frequently contemplate waveform
characteristics  which  are  not  strictly  based  on  a  physical  model  for  seismic 
reflection,  we  may  assume  that  the  techniques  of  visual  seismic  waveform 
classification  are  not  necessarily  limited  to  use  on  seismic  signals.  This 
allows  us  to  look  outside  the  field  to  other  areas  of  waveform  interpretation. 
Several  researchers  have  investigated  automated  syntactic/semantic 
electrocardiogram  (ECG)  analysis  algorithms.  Perhaps  by  examining  the 
work  done  here,  we  can  gain  some  insight  into  the  seismic  interpretation 
problem. 

Horowitz (1975) developed a grammar-based technique for detecting and
labeling significant waveform peaks. The waveforms are first segmented via
a  split  and  merge  algorithm.  The  segments  are  then  given  labels,  allowing 
the  entire  waveform  to  be  parsed  according  to  a  predefined  grammar.  The 
parser  expects  strings  which  are  sequences  of  peaks  and  linear  segments. 

Papakonstantinou et al. (1981) worked with waveforms described by
attributed grammars. These are grammars which are augmented by
numerical semantic information. Including such information allows a parser
not only to describe a waveform's structure, but to infer meaning from that
structure as well. These researchers speculate that such a system would be
useful for ECG interpretation.

An  elaborate  and  application-oriented  system  for  ECG  waveform 
interpretation  has  been  developed  by  Birman  (1982).  The  system,  known  as 
SEEK,  has  been  developed  as  an  aid  to  ECG  interpreters.  The  system  encodes 
ECG  waveforms  syntactically,  also  including  semantic  information  for  a 
richer  signal  description.  A  rule-based  expert  system  then  searches  for 
significant patterns within the waveform and labels them for use by the
interpreter. Such a system is meant to assist interpreters, not to replace
them.

Recently, Bunke et al. (1984) investigated syntactic methods for
interpreting heart-volume curves. Input curves are represented by regular
expressions, then the network of all possible curves is searched until the
most likely match is found. This is similar to error-correcting parsing. The
system associates certainty factors with strings when deciding on a match.

Most of the techniques described above rely on string grammars. While
strings do a good job of representing the sequence of patterns, they do not
sufficiently describe the global structural characteristics that we
are interested in. To capture this structural information, researchers have
turned to higher-dimensional syntactic representations. The simplest of
these representations, and the data structure we shall investigate here, is
the tree. Trees allow a hierarchical description of signals. Frequently,
complex waveforms may be divided into a few major substructures. These
in turn may be divided and subdivided until some terminal feature is
encountered. We shall now describe a tree representation which segments
waveforms according to nested peaks and valleys. This tree structure is
known as the relational tree.


THE  RELATIONAL  TREE  REPRESENTATION 

The relational tree (RT) provides a two-dimensional description of a
one-dimensional signal (Ehrich and Foith, 1976). It draws on the intuitive
notion of a waveform as a sequence of peaks and valleys. The RT structure
contains only information about the relative size and placement of peaks and
valleys in the signal. Attributes may be added to the nodes of the tree to
supply semantic information such as absolute time and magnitude. This
structure is insensitive to monotonic scaling of the domain axis.

Each non-terminal node in an RT represents a valley in the waveform.
Each terminal node represents a peak. The valleys are nested according to
relative depth. The root node of the RT is chosen to represent the deepest
valley in the waveform. This divides the waveform into two segments: one to
the right of this valley, and one to the left. The descendants of this node will
be RTs describing each segment. Each root node is labeled by its dominant
peak, i.e. the highest peak in either segment. The non-terminal descendants
of any node represent the deepest valleys in the right and left segments.
They are in turn labeled by their dominant peaks (see figure 1a and 1b). If a
segment contains only a peak and no valleys, it is represented by a terminal
node and then labeled by that peak.
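
The construction just described can be sketched recursively. The following
Python fragment is only an illustration under stated assumptions: the names
`RTNode`, `local_extrema`, `build_rt`, and `as_tuple` are hypothetical, peaks
are labeled by their sample index rather than by any labeling scheme from the
text, and the waveform is assumed to be an isolated wavelet that starts and
ends low, so that end samples are never peaks.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Sequence


@dataclass
class RTNode:
    label: int                    # sample index of the dominant (highest) peak
    valley: Optional[int] = None  # sample index of the valley; None for a peak node
    children: List["RTNode"] = field(default_factory=list)


def local_extrema(x: Sequence[float]):
    """Return (peaks, valleys) as lists of interior sample indices."""
    peaks = [i for i in range(1, len(x) - 1) if x[i - 1] < x[i] >= x[i + 1]]
    valleys = [i for i in range(1, len(x) - 1) if x[i - 1] > x[i] <= x[i + 1]]
    return peaks, valleys


def build_rt(x, peaks, valleys) -> Optional[RTNode]:
    """Build the relational tree of the segment holding these peaks and valleys."""
    if not peaks:
        return None
    dominant = max(peaks, key=lambda i: x[i])
    # Only valleys lying between two peaks of the segment can split it.
    interior = [v for v in valleys if peaks[0] < v < peaks[-1]]
    if not interior:
        return RTNode(label=dominant)          # terminal node: a lone peak
    deepest = min(interior, key=lambda i: x[i])
    left = build_rt(x, [p for p in peaks if p < deepest],
                    [v for v in valleys if v < deepest])
    right = build_rt(x, [p for p in peaks if p > deepest],
                     [v for v in valleys if v > deepest])
    return RTNode(label=dominant, valley=deepest, children=[left, right])


def as_tuple(node):
    """Compact (label, children) view of a tree, convenient for comparison."""
    return (node.label, tuple(as_tuple(c) for c in node.children))


wave = [0, 3, 1, 5, 2, 4, 0]
root = build_rt(wave, *local_extrema(wave))
print(as_tuple(root))  # (3, ((1, ()), (3, ((3, ()), (5, ())))))
```

In this example the deepest valley (sample 2) becomes the root, labeled by the
highest peak (sample 3); the right segment is split again at the valley at
sample 4.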

An important property of RTs is that they partition the set of
one-dimensional functions into equivalence classes. The partitions may be
viewed as clusters in a pattern space. The nature of the relational tree
structure allows us to classify functions based on that structure.

The node labeling scheme adopted by Ehrich and Foith was sequential, as
shown in figure 1. This labeling scheme leads to a sort of context sensitivity,
i.e. choosing a label for a peak depends on the labels which have already
been assigned to surrounding peaks. To avoid this problem, the convention
introduced in our research is to label peaks by their relative size within the
waveform segment (see figure 2). Peak heights are scaled and quantized to
L levels. When labeled in this manner, all relational trees will have root
label $P_{L-1}$. The smallest peak in a segment will have label $P_0$.
Although this labeling scheme avoids the problem of context sensitivity, it
also leads to non-unique labels for peaks. A further modification is to give
the root node a unique label, $P_L$. This allows for more powerful tree
processing.
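
As a rough illustration of the quantized labeling, the sketch below maps the
peak heights of one segment onto L integer levels. The function name
`quantize_peaks` and the uniform linear scaling between the smallest and
largest peak are assumptions; the text does not specify how heights are
scaled.

```python
def quantize_peaks(heights, levels):
    """Map the peak heights of one segment to integer levels 0 .. levels-1.

    Assumes uniform quantization between the smallest and largest peak.
    """
    lo, hi = min(heights), max(heights)
    if hi == lo:
        # A single peak (or equal peaks) is dominant by default.
        return [levels - 1] * len(heights)
    labels = []
    for h in heights:
        k = int((h - lo) / (hi - lo) * levels)
        labels.append(min(k, levels - 1))   # the top edge belongs to the top level
    return labels


print(["P%d" % k for k in quantize_peaks([3.0, 5.0, 4.0], 6)])
# ['P0', 'P5', 'P3']
```

With six levels the dominant peak always receives the highest level and the
smallest peak the lowest, as described above.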

Researchers have defined alternate tree structures for describing
waveforms (Lu and Cheng, 198_). These more complex tree structures
borrow from the relational tree concept. The results are trees whose
topology represents vertical and horizontal quantization in addition to the
relative placement of peaks and valleys. For this work, however, we desire
the simplicity of the relational tree structure. It should be kept in mind that
any of these more complex structures could be substituted. The choice of
tree structure should depend on the amount of quantization information
required for the specific application.


TREE  DISTANCES  AND  TREE  SPACES 
Many  techniques  have  been  proposed  to  measure  the  distance  between  two 
trees. We shall employ a tree-to-tree distance algorithm described by Lu
(1979). The distance $d(\alpha, \beta)$, where $\alpha$ and $\beta$ are
trees, is defined as the minimum number of node insertions, deletions, or
substitutions necessary to derive one tree from the other. When this
distance exists, it satisfies the following properties of a metric:

1) $d(\alpha, \alpha) = 0$,

2) $d(\alpha, \beta) = d(\beta, \alpha)$,

3) $d(\alpha, \beta) \le d(\alpha, \gamma) + d(\gamma, \beta)$.

Given this metric, we can begin to think of tree spaces, and what they
represent. Also, we can answer the question: what operations are possible
in a tree space?
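
Lu's algorithm is not reproduced here. To give a feel for an edit-based tree
distance, the sketch below implements a simpler and more restricted top-down
distance in the style of Selkow, in which nodes are substituted in place and
whole subtrees are inserted or deleted at a cost equal to their node count.
This is a swapped-in technique, not the method used in this paper, and it may
give larger values than a distance that also allows interior nodes to be
inserted or deleted. The tuple encoding `(label, children)` and the names
`size` and `tree_dist` are assumptions.

```python
from functools import lru_cache

# A tree is (label, (child, child, ...)), children ordered left to right.


def size(tree):
    """Number of nodes in a tree."""
    return 1 + sum(size(c) for c in tree[1])


@lru_cache(maxsize=None)
def tree_dist(a, b):
    """Top-down edit distance with unit cost per node (Selkow-style sketch)."""
    root_cost = 0 if a[0] == b[0] else 1       # substitute the root label
    ca, cb = a[1], b[1]
    m, n = len(ca), len(cb)
    # D[i][j]: cost of turning the first i children of a into the first j of b.
    D = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        D[i][0] = D[i - 1][0] + size(ca[i - 1])          # delete subtree
    for j in range(1, n + 1):
        D[0][j] = D[0][j - 1] + size(cb[j - 1])          # insert subtree
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            D[i][j] = min(
                D[i - 1][j] + size(ca[i - 1]),
                D[i][j - 1] + size(cb[j - 1]),
                D[i - 1][j - 1] + tree_dist(ca[i - 1], cb[j - 1]),
            )
    return root_cost + D[m][n]


single = ("P5", ())
doublet = ("P5", (("P5", ()), ("P3", ())))
print(tree_dist(single, doublet))  # 2
```

Two node insertions turn the single-peak tree into the doublet tree, so the
distance is 2.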

A  tree  space  is  a  directed  graph.  Each  node  in  the  graph  represents  a 
tree, and edges exiting each node represent elementary operations on that
tree.  Figure  3  shows  a  subset  of  trees  and  the  corresponding  tree 
subspace.  Only  insertion  and  deletion  edges  are  included.  In  the  figure,  path 
b  is  shortest,  and  therefore,  the  distance  between  trees  one  and  six  is  two 
edge  traversals. 

Just  as  patterns  may  be  clustered  in  a  feature  vector  space,  trees  can  be 
clustered in a tree space. We shall attack the problem of seismic wavelet
interpretation  by  converting  waveforms  to  trees  and  partitioning  them  into 
clusters  in  a  relational  tree  space. 


PROBLEM STATEMENT AND SOLUTION

The problem we wish to address is that of stratigraphically analyzing a
seismic reflection which may be distorted by varying lithology and
corrupted by noise. Suppose an interpreter knows that the presence of a
sand lens causes an anomalous reflection at a given horizon. The interpreter
has some idea of the structural character of the anomaly, but the exact
waveform is highly variable. The traditional approach to this problem is
visual inspection by an experienced interpreter.

We now describe an automated solution to this problem which makes use
of relational tree structures and traditional pattern recognition. The
system requires some training set of waveforms. This training set might
come from data collected over a producing field, or from synthetic
seismograms. The reflection of interest is extracted from each training
trace and converted to its relational tree representation. Each tree is
assigned to a cluster depending on its character. These training clusters
are then used to design a tree transformation (Gecseg and Steinby, 1984)
which maps the set of relational trees to a subset of relational trees. This
mapping is done so that the resulting clusters are compact and
well-separated according to the tree metric defined above. The tree
transformation has the additional benefit of reducing the number of nodes in
each tree, thereby reducing the complexity of the entire system. The tree
transformation is comparable to linear discriminants as they are used in
feature space clustering. The reflection of interest is then extracted from
traces where the geology is unknown. The extracted wavelets are assigned
to the existing clusters via a k-nearest-neighbor algorithm. If an unknown
reflection's relational tree falls near an anomalous cluster in tree space,
the waveform is said to exhibit the same anomaly as the others in that
cluster. The block diagram shown in figure 4 depicts such a waveform
classification scheme.
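
The final classification step can be sketched as a k-nearest-neighbor vote
over labeled training trees. In the fragment below the function names, the
class names "shale" and "sand", and the toy trees are all hypothetical, and
`size_gap` (the difference in node count) is only a stand-in for a proper
tree-to-tree distance so that the example stays self-contained.

```python
from collections import Counter

# A tree is (label, (child, child, ...)).


def size(tree):
    """Number of nodes in a tree."""
    label, children = tree
    return 1 + sum(size(c) for c in children)


def size_gap(a, b):
    """Stand-in distance: difference in node count (not a true tree metric)."""
    return abs(size(a) - size(b))


def knn_classify(query, training, dist, k=3):
    """Assign query to the majority class among its k nearest training trees."""
    ranked = sorted(training, key=lambda pair: dist(query, pair[0]))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


leaf = ("P5", ())
doublet = ("P5", (("P5", ()), ("P3", ())))
noisy_doublet = ("P5", (("P5", ()), ("P3", (("P3", ()), ("P0", ())))))
training = [
    (leaf, "shale"), (leaf, "shale"),
    (doublet, "sand"), (doublet, "sand"), (noisy_doublet, "sand"),
]
print(knn_classify(noisy_doublet, training, size_gap, k=3))  # sand
print(knn_classify(leaf, training, size_gap, k=3))           # shale
```

Any distance function with the same two-argument signature can be passed in
place of `size_gap`.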


THE  TREE  TRANSFORMATION 

In traditional feature space clustering, the classification error is often
minimized by finding an appropriate linear transformation on the feature
space. This can be reduced to a simple unconstrained minimization problem.
Since we lack such tools as matrix multiplication when dealing with trees,
finding the proper transformation to improve cluster separation becomes a
search problem. A tree transformation is based on a tree transducer. The
formal definition of a tree transformation is given by Gecseg and Steinby
(1984), and lays the foundation for improving clusters in a tree space.

Such a transformation operates on an input tree by starting at the leaves
and working upwards to the root. A node may be transformed when all of its
children have been transformed. The transform actually inserts states in
the tree to act as place markers until a node is ready for transformation.
The variables of the transformation are known as rewriting rules. These
rules represent mappings from subtrees in the input forest (augmented with
states) to subtrees in the output forest. A transformation is tailored to a
specific application by choosing the proper rewriting rules.
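
The bottom-up order of evaluation can be illustrated with a much simplified
sketch. It is not the formal tree transducer of Gecseg and Steinby: explicit
states are replaced by ordinary recursion, and each rewriting rule is a plain
Python function applied to a node once its children have been transformed.
The two rules shown, `drop_small_peaks` and `collapse_single_child`, are
hypothetical examples, and node labels are assumed to be integer
quantization levels.

```python
# A tree is (label, (child, child, ...)); labels are integer peak levels.


def transform(tree, rules):
    """Apply rewriting rules bottom-up; a rule may return None to delete a node."""
    label, children = tree
    done = (transform(c, rules) for c in children)   # children first
    node = (label, tuple(c for c in done if c is not None))
    for rule in rules:
        node = rule(node)
        if node is None:
            return None
    return node


def drop_small_peaks(min_level):
    """Rule factory: delete terminal nodes whose level is below min_level."""
    def rule(node):
        label, children = node
        if not children and label < min_level:
            return None
        return node
    return rule


def collapse_single_child(node):
    """Rule: a valley node left with one child is replaced by that child."""
    label, children = node
    if len(children) == 1:
        return children[0]
    return node


# A doublet whose left peak carries a small noise blip (level 0).
noisy = (5, ((5, ((5, ()), (0, ()))), (3, ())))
rules = [drop_small_peaks(1), collapse_single_child]
print(transform(noisy, rules))  # (5, ((5, ()), (3, ())))
```

The noise blip is stripped and the tree reduces to a clean two-peak doublet,
which is the kind of node reduction described above.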


A  Tree  Clustering  Objective  Function 
Once the data undergoes a transformation, the effect on overall cluster
compactness and separateness must be assessed. We introduce an objective
function here which is the ratio of between-class to within-class scatter.

For the two-class problem, we define the within-class scatter for a
transformed cluster $Y_i$ of size $|Y_i|$ as:

$$S_i = \frac{1}{|Y_i|^2} \sum_{y \in Y_i} \sum_{x \in Y_i} d(y, x)$$

and the between-class scatter $S_B$ as:

$$S_B = \frac{1}{|X_1||X_2|} \sum_{x_1 \in X_1} \sum_{x_2 \in X_2} d(x_1, x_2)$$

$X_i$ and $Y_i$ are transformed clusters of trees, $|X_i|$ is the number of
sample trees in $X_i$, and $d(x, y)$ is some metric between trees. The
objective function for the two-class case is:

$$J(X) = \frac{S_B}{S_1 + S_2}$$

This may be easily generalized to c clusters, where c is greater than two.
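
A minimal sketch of the two-class objective follows. It assumes the
within-class scatter is the sum of pairwise distances normalized by the
squared cluster size, and it accepts any two-argument distance function; the
function names are hypothetical, and plain numbers stand in for trees so that
the arithmetic is easy to check by hand.

```python
def within_scatter(cluster, dist):
    """Average pairwise distance inside one cluster (normalized by |Y|^2)."""
    n = len(cluster)
    return sum(dist(y, x) for y in cluster for x in cluster) / (n * n)


def between_scatter(c1, c2, dist):
    """Average distance between members of two different clusters."""
    return sum(dist(a, b) for a in c1 for b in c2) / (len(c1) * len(c2))


def objective(c1, c2, dist):
    """Ratio of between-class to within-class scatter for two clusters."""
    denom = within_scatter(c1, dist) + within_scatter(c2, dist)
    sb = between_scatter(c1, c2, dist)
    return float("inf") if denom == 0 else sb / denom


# Numbers stand in for trees here; abs difference stands in for the metric.
gap = lambda a, b: abs(a - b)
print(objective([0, 1], [10, 11], gap))  # 10.0
```

Here each cluster has within-class scatter 0.5 and the between-class scatter
is 10, giving J = 10 / (0.5 + 0.5) = 10. A larger value indicates clusters
that are more compact and better separated.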

TREE  TRANSFORM  DESIGN 

The transform design procedure is implemented as an AI production
system. Candidate transform rules are chosen by a human expert, and an
optimal path search algorithm chooses those rules which maximize the
objective function. The search is performed such that preference is given to
those tree transforms with the fewest rules. This search is time consuming,
but by setting an objective function threshold, the entire set of rule
combinations need not necessarily be considered.
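
The rule selection can be pictured with the sketch below. It is a stand-in,
not the optimal path search used here: it simply enumerates rule subsets in
order of increasing size, so that smaller rule sets are preferred, and stops
as soon as the objective reaches a threshold. The function `choose_rules`,
the rule names, and the toy scoring table are all hypothetical.

```python
from itertools import combinations


def choose_rules(candidates, score, threshold=None):
    """Search rule subsets, smallest first; stop early once score >= threshold."""
    best_rules, best_score = (), score(())
    if threshold is not None and best_score >= threshold:
        return best_rules, best_score
    for size in range(1, len(candidates) + 1):
        for subset in combinations(candidates, size):
            s = score(subset)
            if s > best_score:          # strict: ties keep the smaller rule set
                best_rules, best_score = subset, s
            if threshold is not None and best_score >= threshold:
                return best_rules, best_score
    return best_rules, best_score


# Toy objective: each hypothetical rule adds a fixed amount to the score.
gains = {"drop_small_peaks": 1.0, "collapse_single_child": 3.0,
         "merge_shallow_valleys": 2.0}
toy_score = lambda rules: sum(gains[r] for r in rules)

print(choose_rules(list(gains), toy_score, threshold=4.0))
# (('drop_small_peaks', 'collapse_single_child'), 4.0)
```

In practice `score` would transform the training clusters with the given
rules and evaluate the clustering objective on the result; without a
threshold the loop examines every combination.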

SIMULATIONS 

We shall now apply the waveform recognition system to simulated seismic
data and observe the results. Figure 5, reproduced from AAPG Memoir No. 26,
shows a typical seismogram of a thin sand imbedded in a shale. An expert
seismic interpreter can easily spot such an anomaly. A wavelet with a single
peak and a trough becomes a doublet over the sand lens. It is not so easy
for a machine to make such a qualitative judgement. Due to varying
frequencies, noise contamination, changing bed thickness, and segmentation
errors, a purely numerical technique, such as a matched filter,
may not succeed in identifying those traces that contain sand. However, this
is an ideal two-class relational tree clustering problem.

A  seismogram  over  a  known  sand  lens  will  serve  as  a  training  set  of 
waveforms.  From  the  corresponding  relational  trees,  a  tree  transform  can 
be  found  which  improves  clustering  performance  based  on  the  objective 
function  described  earlier.  Seismic  traces  from  unknown  geology  may  then 
be  compared  to  the  two  clusters  and  classified  as  belonging  to  the  cluster  of 
their  nearest  k  neighbors.  Figure  6  depicts  schematically  the  two 
waveforms  in  question  and  their  relational  trees.  Variations  of  these  trees 
will occur due to noise, changing geology, etc. The tree transformation will
be designed to eliminate these inhomogeneities as much as possible. The
rates of successful classification will be compared before and after
transforming the tree clustering space by simulating the waveforms,
distorting them, adding noise, and then applying the classification
procedure.

The  number  of  peak  quantization  levels  used  in  this  example  was  six.  This 
provided  reasonable  identification  of  critical  peaks,  while  keeping  the  tree 
grammar  small  enough  to  work  with. 

Seismic  Classification  Results 

Waveforms were simulated allowing for variations in horizontal intervals.
Colored Gaussian noise was then added. Also, waveforms were taken from
figure 5 and tested. Figure 7 gives the results of transformation and
classification for various signal-to-noise ratios.
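
A test set of this kind could be generated along the following lines. The
sketch rests on several assumptions that are not taken from the experiment:
a Ricker wavelet is used as the basic pulse, the doublet is the sum of two
shifted pulses, the noise is colored by a short moving average of white
Gaussian noise, and the signal-to-noise ratio is defined as a power ratio in
decibels. All function names and parameter values are hypothetical.

```python
import math
import random


def ricker(n, center, freq, dt=0.002):
    """Ricker pulse of peak frequency freq (Hz), centered at sample `center`."""
    out = []
    for i in range(n):
        a = (math.pi * freq * (i - center) * dt) ** 2
        out.append((1.0 - 2.0 * a) * math.exp(-a))
    return out


def doublet(n, center, sep, freq):
    """Two pulses `sep` samples apart; varying sep varies the horizontal interval."""
    left = ricker(n, center - sep / 2.0, freq)
    right = ricker(n, center + sep / 2.0, freq)
    return [a + b for a, b in zip(left, right)]


def colored_noise(n, rng, width=5):
    """White Gaussian noise smoothed by a moving average of `width` samples."""
    white = [rng.gauss(0.0, 1.0) for _ in range(n + width - 1)]
    return [sum(white[i:i + width]) / width for i in range(n)]


def add_noise(signal, snr_db, rng):
    """Add colored noise scaled so the power ratio equals snr_db."""
    noise = colored_noise(len(signal), rng)
    p_sig = sum(s * s for s in signal) / len(signal)
    p_noise = sum(v * v for v in noise) / len(noise)
    scale = math.sqrt(p_sig / (p_noise * 10.0 ** (snr_db / 10.0)))
    return [s + scale * v for s, v in zip(signal, noise)]


def measured_snr_db(clean, noisy):
    """Signal-to-noise ratio in dB recovered from a clean/noisy pair."""
    p_sig = sum(s * s for s in clean) / len(clean)
    p_err = sum((y - s) ** 2 for s, y in zip(clean, noisy)) / len(clean)
    return 10.0 * math.log10(p_sig / p_err)


rng = random.Random(0)                      # fixed seed for repeatability
clean = doublet(128, 64, 12, 30.0)
noisy = add_noise(clean, 10.0, rng)
print(round(measured_snr_db(clean, noisy), 3))  # 10.0
```

Each noisy trace would then be converted to its relational tree and
classified, and the error rate tabulated against the signal-to-noise ratio.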

DISCUSSION  OF  RESULTS 

The results of this experiment can be compared to the existing
numerical techniques. Since within each signal class there are infinitely
many variations in the waveform, an infinite number of matched filters
would be required to accurately represent the problem. Assuming the signal
set could be limited to a finite number of possible forms, a bank of matched
filters would perform better than the technique presented here at low SNRs,
but not as well at high SNRs. When the amplitude of the noise becomes
large enough to distort the peak dominance relations in the tree
representation, the method breaks down. A further consideration is the
complexity of the system. Banks of matched filters are difficult to
implement, and require many fixed or floating point operations. The
relational tree clustering technique, once the transform operations have
been selected, requires only addressing operations and integer
comparisons.

The power of the tree transformation might be enhanced by taking
semantic information into account. If the single attribute "valley height"
were to be included at each non-terminal node in the relational tree, more
intelligent filtering could take place. It was the inability to distinguish
absolute valley depths that limited the success of the system at low
signal-to-noise ratios. The inclusion of at least some semantic information
is evidently crucial for the recognition of complex waveforms. The theory for
handling semantic information in the tree transform needs to be thoroughly
developed before any improvements can be made to the implementation.

CONCLUSION 

We  have  endeavored  to  construct  a  system  which  will  identify  and  classify 
waveforms  based  on  their  underlying  structural  similarities.  The  relational 
tree  is  a  computer  data  structure  that  represents  a  waveform  by  the 
relative  placement  of  peaks  and  valleys.  We  can  treat  the  relational  tree 
much  as  a  vector  in  pattern  space.  Using  a  tree  metric  and  many  of  the 
concepts  from  traditional  cluster  analysis,  we  have  designed  a  waveform 
recognition  system  which  employs  a  tree  transformation. 

After  implementing  the  waveform  recognition  system  and  testing  it  on 
simulated  reflection  seismic  data,  the  following  observations  can  be  made. 

(1) The symbolic recognition system in its present form is only feasible if
the tree complexity is low, i.e. the signal contains a small number of peaks
and valleys.

(2) For these waveforms, the classification error is equal to or better
than that of numerical techniques at high signal-to-noise ratios, but
abruptly becomes worse as relative peak and valley heights are altered by
noise.

(3) The transform effectiveness could be greatly enhanced by adding
semantic information, but the theory governing such a transform has yet to
be developed.

Figure 2

Modified Peak Labeling Scheme

Figure 3

A Tree Subspace. Path b is the minimum distance.

Figure 4

The Relational Tree Waveform Classification System

Figure 5

Seismogram over a sand imbedded in shale (after Neidell (1977))

Figure 6

Seismic Waveforms and Relational Trees

Figure 7

Results of classifying seismic wavelets with the relational tree system




Seismic  Stratigraphic  Interpretation 


November  25, 1986 




